Dataset statistics
| Number of variables | 17 |
|---|---|
| Number of observations | 1776633 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 230.4 MiB |
| Average record size in memory | 136.0 B |
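As a rough consistency check on the figures above, total memory should be close to the number of observations times the average record size. The sketch below uses only values from the table. The 8-byte width per value is an assumption (for example 64-bit numbers or object pointers), not something the report states.

```python
# Values copied from the "Dataset statistics" table.
n_observations = 1_776_633
avg_record_bytes = 136.0  # average record size in memory
n_variables = 17

MIB = 1024 ** 2

# Total size = observations * average record size.
total_mib = n_observations * avg_record_bytes / MIB
print(f"total: {total_mib:.1f} MiB")

# Assumption: every column stores 8-byte values, so one record
# takes 17 * 8 bytes and one column takes n_observations * 8 bytes.
bytes_per_value = 8
record_bytes = n_variables * bytes_per_value
column_mib = n_observations * bytes_per_value / MIB
print(f"record: {record_bytes} B, per column: {column_mib:.1f} MiB")
```

Under that assumption the per-column figure agrees with the "Memory size" reported for each variable.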
Variable types
| NUM | 12 |
|---|---|
| CAT | 4 |
| DATE | 1 |
Reproduction
| Analysis started | 2020-11-21 18:56:34.831341 |
|---|---|
| Analysis finished | 2020-11-21 18:59:51.900866 |
| Duration | 3 minutes and 17.07 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
| Time has a high cardinality: 1439 distinct values | High cardinality |
|---|---|
| df_index has unique values | Unique |
| Accident_Index has unique values | Unique |
df_index
Real number (ℝ≥0)
| Distinct count | 1776633 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 890240.0305319106 |
|---|---|
| Minimum | 0 |
| Maximum | 1780651 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 88983.6 |
| Q1 | 445082 |
| median | 890219 |
| Q3 | 1335387 |
| 95-th percentile | 1691455.4 |
| Maximum | 1780651 |
| Range | 1780651 |
| Interquartile range (IQR) | 890305 |
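The range and interquartile range in this table follow directly from the other rows. The sketch below recomputes them and also estimates how many index values are absent, assuming df_index holds integers (the report only says the values are unique).

```python
# Values copied from the df_index "Quantile statistics" table.
minimum, q1, q3, maximum = 0, 445082, 1335387, 1780651
n_observations = 1_776_633  # from the "Dataset statistics" table

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)

# Assumption: df_index is integer-valued. Because it is also unique,
# every integer in [minimum, maximum] that is not present is a gap.
gaps = (maximum - minimum + 1) - n_observations

print(value_range, iqr, gaps)
```

A non-zero gap count would be consistent with rows having been dropped before profiling, though the report itself does not say so.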
Descriptive statistics
| Standard deviation | 513997.7269 |
|---|---|
| Coefficient of variation (CV) | 0.5773698208 |
| Kurtosis | -1.199892811 |
| Mean | 890240.0305 |
| Median Absolute Deviation (MAD) | 445153 |
| Skewness | 0.0001794270646 |
| Sum | 1.581629816e+12 |
| Variance | 2.641936633e+11 |
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 1573290 | 1 | < 0.1% | |
| 1679774 | 1 | < 0.1% | |
| 1681823 | 1 | < 0.1% | |
| 1593760 | 1 | < 0.1% | |
| 1595809 | 1 | < 0.1% | |
| 1591715 | 1 | < 0.1% | |
| 1601956 | 1 | < 0.1% | |
| 1604005 | 1 | < 0.1% | |
| 1597862 | 1 | < 0.1% | |
| Other values (1776623) | 1776623 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1780651 | 1 | < 0.1% | |
| 1780650 | 1 | < 0.1% | |
| 1780649 | 1 | < 0.1% | |
| 1780648 | 1 | < 0.1% | |
| 1780647 | 1 | < 0.1% |
Accident_Index
Categorical
| Distinct count | 1776633 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.6 MiB |
| Value | Count | Frequency (%) | |
| 200701FH10195 | 1 | < 0.1% | |
| 201401CW11492 | 1 | < 0.1% | |
| 201297UD03011 | 1 | < 0.1% | |
| 201230C000077 | 1 | < 0.1% | |
| 2005070502592 | 1 | < 0.1% | |
| 2012160D05521 | 1 | < 0.1% | |
| 2013520401026 | 1 | < 0.1% | |
| 200501ZD30252 | 1 | < 0.1% | |
| 2008930000879 | 1 | < 0.1% | |
| 2014450013162 | 1 | < 0.1% | |
| Other values (1776623) | 1776623 | > 99.9% |
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Longitude
Real number (ℝ)
| Distinct count | 1243949 |
|---|---|
| Unique (%) | 70.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -1.4285615522063362 |
|---|---|
| Minimum | -7.516225 |
| Maximum | 1.7620099999999999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | -7.516225 |
|---|---|
| 5-th percentile | -4.0589504 |
| Q1 | -2.355523 |
| median | -1.3879 |
| Q3 | -0.215937 |
| 95-th percentile | 0.5598604 |
| Maximum | 1.76201 |
| Range | 9.278235 |
| Interquartile range (IQR) | 2.139586 |
Descriptive statistics
| Standard deviation | 1.403892363 |
|---|---|
| Coefficient of variation (CV) | -0.9827314484 |
| Kurtosis | -0.357684417 |
| Mean | -1.428561552 |
| Median Absolute Deviation (MAD) | 1.108824 |
| Skewness | -0.374797897 |
| Sum | -2538029.596 |
| Variance | 1.970913768 |
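The coefficient of variation here is negative, which can look odd at first. It is the standard deviation divided by the mean, so it takes the sign of the mean. A minimal sketch using the values from this table (the function name is illustrative):

```python
# Values copied from the Longitude "Descriptive statistics" table.
std_dev = 1.403892363
mean = -1.428561552


def coefficient_of_variation(std, mean_value):
    """Return std / mean; the sign follows the sign of the mean."""
    return std / mean_value


cv = coefficient_of_variation(std_dev, mean)
print(round(cv, 6))
```

For a coordinate such as longitude, whose zero point is a convention, the CV says little about spread; the standard deviation or IQR is usually the more meaningful figure.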
| Value | Count | Frequency (%) | |
| -0.977611 | 68 | < 0.1% | |
| -1.871043 | 57 | < 0.1% | |
| -3.310596 | 48 | < 0.1% | |
| -0.104426 | 46 | < 0.1% | |
| -0.173445 | 45 | < 0.1% | |
| -1.999967 | 44 | < 0.1% | |
| -1.234393 | 43 | < 0.1% | |
| -3.241694 | 43 | < 0.1% | |
| -1.216694 | 42 | < 0.1% | |
| -0.816789 | 42 | < 0.1% | |
| Other values (1243939) | 1776155 | > 99.9% |
| Value | Count | Frequency (%) | |
| -7.516225 | 1 | < 0.1% | |
| -7.515933 | 1 | < 0.1% | |
| -7.509162 | 1 | < 0.1% | |
| -7.507468 | 1 | < 0.1% | |
| -7.507207 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1.76201 | 1 | < 0.1% | |
| 1.759398 | 1 | < 0.1% | |
| 1.759382 | 2 | < 0.1% | |
| 1.758797 | 1 | < 0.1% | |
| 1.758722 | 1 | < 0.1% |
Latitude
Real number (ℝ≥0)
| Distinct count | 1166816 |
|---|---|
| Unique (%) | 65.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 52.57375245386696 |
|---|---|
| Minimum | 49.912941 |
| Maximum | 60.757543999999996 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 49.912941 |
|---|---|
| 5-th percentile | 50.8234186 |
| Q1 | 51.487541 |
| median | 52.268092 |
| Q3 | 53.464518 |
| 95-th percentile | 55.8368152 |
| Maximum | 60.757544 |
| Range | 10.844603 |
| Interquartile range (IQR) | 1.976977 |
Descriptive statistics
| Standard deviation | 1.451980973 |
|---|---|
| Coefficient of variation (CV) | 0.02761798245 |
| Kurtosis | 0.8039279664 |
| Mean | 52.57375245 |
| Median Absolute Deviation (MAD) | 0.880182 |
| Skewness | 1.018398657 |
| Sum | 93404263.54 |
| Variance | 2.108248745 |
| Value | Count | Frequency (%) | |
| 52.458798 | 74 | < 0.1% | |
| 52.949719 | 68 | < 0.1% | |
| 51.519764 | 49 | < 0.1% | |
| 51.506693 | 48 | < 0.1% | |
| 51.526956 | 45 | < 0.1% | |
| 52.470689 | 44 | < 0.1% | |
| 52.989857 | 43 | < 0.1% | |
| 51.482076 | 42 | < 0.1% | |
| 52.47217 | 42 | < 0.1% | |
| 54.96844 | 42 | < 0.1% | |
| Other values (1166806) | 1776136 | > 99.9% |
| Value | Count | Frequency (%) | |
| 49.912941 | 1 | < 0.1% | |
| 49.913077 | 1 | < 0.1% | |
| 49.914145 | 1 | < 0.1% | |
| 49.91443 | 1 | < 0.1% | |
| 49.914488 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 60.757544 | 1 | < 0.1% | |
| 60.724682 | 1 | < 0.1% | |
| 60.714774 | 1 | < 0.1% | |
| 60.714772 | 1 | < 0.1% | |
| 60.668921 | 1 | < 0.1% |
Police_Force
Real number (ℝ≥0)
| Distinct count | 51 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 30.745305867897308 |
|---|---|
| Minimum | 1 |
| Maximum | 98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 7 |
| median | 31 |
| Q3 | 46 |
| 95-th percentile | 94 |
| Maximum | 98 |
| Range | 97 |
| Interquartile range (IQR) | 39 |
Descriptive statistics
| Standard deviation | 25.52641903 |
|---|---|
| Coefficient of variation (CV) | 0.8302541903 |
| Kurtosis | 0.3125859326 |
| Mean | 30.74530587 |
| Median Absolute Deviation (MAD) | 19 |
| Skewness | 0.8381922888 |
| Sum | 54623125 |
| Variance | 651.5980684 |
| Value | Count | Frequency (%) | |
| 1 | 264636 | 14.9% | |
| 20 | 74447 | 4.2% | |
| 43 | 65998 | 3.7% | |
| 13 | 65749 | 3.7% | |
| 6 | 64754 | 3.6% | |
| 46 | 56080 | 3.2% | |
| 44 | 54343 | 3.1% | |
| 50 | 50939 | 2.9% | |
| 4 | 49317 | 2.8% | |
| 97 | 49296 | 2.8% | |
| Other values (41) | 981074 | 55.2% |
| Value | Count | Frequency (%) | |
| 1 | 264636 | 14.9% | |
| 3 | 16079 | 0.9% | |
| 4 | 49317 | 2.8% | |
| 5 | 37150 | 2.1% | |
| 6 | 64754 | 3.6% |
| Value | Count | Frequency (%) | |
| 98 | 4087 | 0.2% | |
| 97 | 49296 | 2.8% | |
| 96 | 6382 | 0.4% | |
| 95 | 25869 | 1.5% | |
| 94 | 5820 | 0.3% |
Accident_Severity
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.6 MiB |
| Value | Count | Frequency (%) | |
| 3 | 1512341 | 85.1% | |
| 2 | 241436 | 13.6% | |
| 1 | 22856 | 1.3% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Number_of_Vehicles
Real number (ℝ≥0)
| Distinct count | 8 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.8300661982525372 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.700601846 |
|---|---|
| Coefficient of variation (CV) | 0.3828286904 |
| Kurtosis | 5.130288277 |
| Mean | 1.830066198 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.262738655 |
| Sum | 3251356 |
| Variance | 0.4908429466 |
| Value | Count | Frequency (%) | |
| 2 | 1056778 | 59.5% | |
| 1 | 537984 | 30.3% | |
| 3 | 141915 | 8.0% | |
| 4 | 30182 | 1.7% | |
| 5 | 6687 | 0.4% | |
| 6 | 2039 | 0.1% | |
| 7 | 710 | < 0.1% | |
| 8 | 338 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 537984 | 30.3% | |
| 2 | 1056778 | 59.5% | |
| 3 | 141915 | 8.0% | |
| 4 | 30182 | 1.7% | |
| 5 | 6687 | 0.4% |
| Value | Count | Frequency (%) | |
| 8 | 338 | < 0.1% | |
| 7 | 710 | < 0.1% | |
| 6 | 2039 | 0.1% | |
| 5 | 6687 | 0.4% | |
| 4 | 30182 | 1.7% |
Number_of_Casualties
Real number (ℝ≥0)
| Distinct count | 8 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3435718012667783 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.7560019664 |
|---|---|
| Coefficient of variation (CV) | 0.5626807333 |
| Kurtosis | 11.74557609 |
| Mean | 1.343571801 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.00196662 |
| Sum | 2387034 |
| Variance | 0.5715389731 |
| Value | Count | Frequency (%) | |
| 1 | 1364911 | 76.8% | |
| 2 | 284539 | 16.0% | |
| 3 | 81106 | 4.6% | |
| 4 | 29249 | 1.6% | |
| 5 | 10821 | 0.6% | |
| 6 | 4019 | 0.2% | |
| 7 | 1392 | 0.1% | |
| 8 | 596 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 1364911 | 76.8% | |
| 2 | 284539 | 16.0% | |
| 3 | 81106 | 4.6% | |
| 4 | 29249 | 1.6% | |
| 5 | 10821 | 0.6% |
| Value | Count | Frequency (%) | |
| 8 | 596 | < 0.1% | |
| 7 | 1392 | 0.1% | |
| 6 | 4019 | 0.2% | |
| 5 | 10821 | 0.6% | |
| 4 | 29249 | 1.6% |
Date
Date
| Distinct count | 4017 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.6 MiB |
| Minimum | 2005-01-01 00:00:00 |
|---|---|
| Maximum | 2015-12-31 00:00:00 |
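The distinct count can be compared with the number of calendar days between the minimum and maximum dates. The sketch below uses only the three values reported above.

```python
from datetime import date

# Values copied from the Date summary above.
first_day = date(2005, 1, 1)
last_day = date(2015, 12, 31)
distinct_dates = 4017

# Number of calendar days in the closed interval [first_day, last_day].
days_in_span = (last_day - first_day).days + 1

print(days_in_span, days_in_span == distinct_dates)
```

If the two numbers agree, every calendar day in the period has at least one record.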
Day_of_Week
Real number (ℝ≥0)
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.115380610401811 |
|---|---|
| Minimum | 1 |
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 7 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.923748326 |
|---|---|
| Coefficient of variation (CV) | 0.4674533191 |
| Kurtosis | -1.187251336 |
| Mean | 4.11538061 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.06485006723 |
| Sum | 7311521 |
| Variance | 3.70080762 |
| Value | Count | Frequency (%) | |
| 6 | 290694 | 16.4% | |
| 4 | 267780 | 15.1% | |
| 5 | 266912 | 15.0% | |
| 3 | 266112 | 15.0% | |
| 2 | 252678 | 14.2% | |
| 7 | 237588 | 13.4% | |
| 1 | 194869 | 11.0% |
| Value | Count | Frequency (%) | |
| 1 | 194869 | 11.0% | |
| 2 | 252678 | 14.2% | |
| 3 | 266112 | 15.0% | |
| 4 | 267780 | 15.1% | |
| 5 | 266912 | 15.0% |
| Value | Count | Frequency (%) | |
| 7 | 237588 | 13.4% | |
| 6 | 290694 | 16.4% | |
| 5 | 266912 | 15.0% | |
| 4 | 267780 | 15.1% | |
| 3 | 266112 | 15.0% |
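The report does not say how Day_of_Week is coded. Comparing a few (Date, Day_of_Week) pairs from the first and last rows of this report with the calendar suggests 1 = Sunday through 7 = Saturday. The sketch below checks that inference; the helper name is hypothetical and the mapping is an assumption, not documented behaviour.

```python
from datetime import date

# (Date, Day_of_Week) pairs copied from the "First rows" and "Last rows" tables.
samples = [
    (date(2005, 1, 4), 3),
    (date(2005, 1, 10), 2),
    (date(2005, 1, 15), 7),
    (date(2015, 12, 11), 6),
    (date(2015, 12, 13), 1),
]


def sunday_first_code(d):
    """Map a date to 1..7 with 1 = Sunday (assumed coding)."""
    # isoweekday() returns 1 = Monday ... 7 = Sunday.
    return d.isoweekday() % 7 + 1


matches = all(sunday_first_code(d) == code for d, code in samples)
print(matches)
```

Under this coding, the most frequent value (6) would be Friday and the least frequent (1) would be Sunday.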
Time
Categorical
| Distinct count | 1439 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.6 MiB |
| Value | Count | Frequency (%) | |
| 17:00 | 17309 | 1.0% | |
| 17:30 | 16496 | 0.9% | |
| 16:00 | 15852 | 0.9% | |
| 18:00 | 15682 | 0.9% | |
| 15:30 | 15526 | 0.9% | |
| 16:30 | 15044 | 0.8% | |
| 15:00 | 13910 | 0.8% | |
| 08:30 | 13885 | 0.8% | |
| 13:00 | 12580 | 0.7% | |
| 18:30 | 12554 | 0.7% | |
| Other values (1429) | 1627795 | 91.6% |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Road_Type
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.167390226343876 |
|---|---|
| Minimum | 1 |
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 6 |
| median | 6 |
| Q3 | 6 |
| 95-th percentile | 6 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.644681763 |
|---|---|
| Coefficient of variation (CV) | 0.3182809292 |
| Kurtosis | 0.7315078305 |
| Mean | 5.167390226 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -1.390747205 |
| Sum | 9180556 |
| Variance | 2.704978101 |
| Value | Count | Frequency (%) | |
| 6 | 1329635 | 74.8% | |
| 3 | 262182 | 14.8% | |
| 1 | 119167 | 6.7% | |
| 2 | 36656 | 2.1% | |
| 7 | 18608 | 1.0% | |
| 9 | 10385 | 0.6% |
| Value | Count | Frequency (%) | |
| 1 | 119167 | 6.7% | |
| 2 | 36656 | 2.1% | |
| 3 | 262182 | 14.8% | |
| 6 | 1329635 | 74.8% | |
| 7 | 18608 | 1.0% |
| Value | Count | Frequency (%) | |
| 9 | 10385 | 0.6% | |
| 7 | 18608 | 1.0% | |
| 6 | 1329635 | 74.8% | |
| 3 | 262182 | 14.8% | |
| 2 | 36656 | 2.1% |
Speed_limit
Real number (ℝ≥0)
| Distinct count | 9 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39.02097957203316 |
|---|---|
| Minimum | 0 |
| Maximum | 70 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 30 |
| Q1 | 30 |
| median | 30 |
| Q3 | 50 |
| 95-th percentile | 70 |
| Maximum | 70 |
| Range | 70 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 14.15344896 |
|---|---|
| Coefficient of variation (CV) | 0.3627138303 |
| Kurtosis | -0.4062700551 |
| Mean | 39.02097957 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.098379489 |
| Sum | 69325960 |
| Variance | 200.3201175 |
| Value | Count | Frequency (%) | |
| 30 | 1139301 | 64.1% | |
| 60 | 281623 | 15.9% | |
| 40 | 146003 | 8.2% | |
| 70 | 129278 | 7.3% | |
| 50 | 58390 | 3.3% | |
| 20 | 22002 | 1.2% | |
| 10 | 19 | < 0.1% | |
| 15 | 16 | < 0.1% | |
| 0 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 10 | 19 | < 0.1% | |
| 15 | 16 | < 0.1% | |
| 20 | 22002 | 1.2% | |
| 30 | 1139301 | 64.1% |
| Value | Count | Frequency (%) | |
| 70 | 129278 | 7.3% | |
| 60 | 281623 | 15.9% | |
| 50 | 58390 | 3.3% | |
| 40 | 146003 | 8.2% | |
| 30 | 1139301 | 64.1% |
Light_Conditions
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.9502750427353315 |
|---|---|
| Minimum | 1 |
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.647988079 |
|---|---|
| Coefficient of variation (CV) | 0.8450029061 |
| Kurtosis | 0.5814640214 |
| Mean | 1.950275043 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.40133637 |
| Sum | 3464923 |
| Variance | 2.715864708 |
| Value | Count | Frequency (%) | |
| 1 | 1301556 | 73.3% | |
| 4 | 349036 | 19.6% | |
| 6 | 98728 | 5.6% | |
| 7 | 19145 | 1.1% | |
| 5 | 8168 | 0.5% |
| Value | Count | Frequency (%) | |
| 1 | 1301556 | 73.3% | |
| 4 | 349036 | 19.6% | |
| 5 | 8168 | 0.5% | |
| 6 | 98728 | 5.6% | |
| 7 | 19145 | 1.1% |
| Value | Count | Frequency (%) | |
| 7 | 19145 | 1.1% | |
| 6 | 98728 | 5.6% | |
| 5 | 8168 | 0.5% | |
| 4 | 349036 | 19.6% | |
| 1 | 1301556 | 73.3% |
Weather_Conditions
Real number (ℝ≥0)
| Distinct count | 9 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.5680875003447532 |
|---|---|
| Minimum | 1 |
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 5 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.625423403 |
|---|---|
| Coefficient of variation (CV) | 1.036564224 |
| Kurtosis | 11.45094836 |
| Mean | 1.5680875 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.481336994 |
| Sum | 2785916 |
| Variance | 2.642001238 |
| Value | Count | Frequency (%) | |
| 1 | 1421543 | 80.0% | |
| 2 | 210300 | 11.8% | |
| 8 | 39045 | 2.2% | |
| 9 | 32281 | 1.8% | |
| 5 | 25826 | 1.5% | |
| 4 | 23284 | 1.3% | |
| 3 | 12390 | 0.7% | |
| 7 | 9664 | 0.5% | |
| 6 | 2300 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 1421543 | 80.0% | |
| 2 | 210300 | 11.8% | |
| 3 | 12390 | 0.7% | |
| 4 | 23284 | 1.3% | |
| 5 | 25826 | 1.5% |
| Value | Count | Frequency (%) | |
| 9 | 32281 | 1.8% | |
| 8 | 39045 | 2.2% | |
| 7 | 9664 | 0.5% | |
| 6 | 2300 | 0.1% | |
| 5 | 25826 | 1.5% |
Road_Surface_Conditions
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3615670765993877 |
|---|---|
| Minimum | 1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 13.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.6184766256 |
|---|---|
| Coefficient of variation (CV) | 0.4542388225 |
| Kurtosis | 6.156321541 |
| Mean | 1.361567077 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.158361787 |
| Sum | 2419005 |
| Variance | 0.3825133364 |
| Value | Count | Frequency (%) | |
| 1 | 1225326 | 69.0% | |
| 2 | 501335 | 28.2% | |
| 4 | 35935 | 2.0% | |
| 3 | 11458 | 0.6% | |
| 5 | 2579 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 1225326 | 69.0% | |
| 2 | 501335 | 28.2% | |
| 3 | 11458 | 0.6% | |
| 4 | 35935 | 2.0% | |
| 5 | 2579 | 0.1% |
| Value | Count | Frequency (%) | |
| 5 | 2579 | 0.1% | |
| 4 | 35935 | 2.0% | |
| 3 | 11458 | 0.6% | |
| 2 | 501335 | 28.2% | |
| 1 | 1225326 | 69.0% |
Urban_or_Rural_Area
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.6 MiB |
| Value | Count | Frequency (%) | |
| 1 | 1144275 | 64.4% | |
| 2 | 632322 | 35.6% | |
| 3 | 36 | < 0.1% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
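The calculation described above can be sketched in plain Python. This is an illustration of the formula, not the code the report generator runs, and the sample data is made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Made-up sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(r, r_rescaled)
```

The same 1/n factor is used for the covariance and both standard deviations, so it cancels; using n - 1 throughout would give the same r.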
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
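A minimal sketch of that procedure: rank both variables, then apply the Pearson formula to the ranks. Assigning tied values the average of their ranks is a common convention and is assumed here; the helper names and sample data are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up data: y = x**2 is monotonic in x but not linear.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
print(pearson_r(xs, ys), spearman_rho(xs, ys))
```

On this data Pearson's r stays below 1 while ρ reaches 1 (up to floating-point rounding), which is the difference the paragraph above describes.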
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
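The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, with made-up data. Libraries often compute a tie-adjusted variant (τ-b) instead, so results can differ when the data contains ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: the pair is ordered the same way in both variables.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie; it counts toward the total only.
    return (concordant - discordant) / len(pairs)


# Made-up sample data: one swapped pair out of six.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau_a(xs, ys)
print(tau)
```

This direct version compares every pair, so it runs in quadratic time and is only practical for small samples.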
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013 that can be found here.
First rows
| | df_index | Accident_Index | Longitude | Latitude | Police_Force | Accident_Severity | Number_of_Vehicles | Number_of_Casualties | Date | Day_of_Week | Time | Road_Type | Speed_limit | Light_Conditions | Weather_Conditions | Road_Surface_Conditions | Urban_or_Rural_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 200501BS00001 | -0.191170 | 51.489096 | 1 | 2 | 1 | 1 | 2005-01-04 | 3 | 17:42 | 6 | 30 | 1 | 2 | 2 | 1 |
| 1 | 1 | 200501BS00002 | -0.211708 | 51.520075 | 1 | 3 | 1 | 1 | 2005-01-05 | 4 | 17:36 | 3 | 30 | 4 | 1 | 1 | 1 |
| 2 | 2 | 200501BS00003 | -0.206458 | 51.525301 | 1 | 3 | 2 | 1 | 2005-01-06 | 5 | 00:15 | 6 | 30 | 4 | 1 | 1 | 1 |
| 3 | 3 | 200501BS00004 | -0.173862 | 51.482442 | 1 | 3 | 1 | 1 | 2005-01-07 | 6 | 10:35 | 6 | 30 | 1 | 1 | 1 | 1 |
| 4 | 4 | 200501BS00005 | -0.156618 | 51.495752 | 1 | 3 | 1 | 1 | 2005-01-10 | 2 | 21:13 | 6 | 30 | 7 | 1 | 2 | 1 |
| 5 | 5 | 200501BS00006 | -0.203238 | 51.515540 | 1 | 3 | 2 | 1 | 2005-01-11 | 3 | 12:40 | 6 | 30 | 1 | 2 | 2 | 1 |
| 6 | 6 | 200501BS00007 | -0.211277 | 51.512695 | 1 | 3 | 2 | 1 | 2005-01-13 | 5 | 20:40 | 6 | 30 | 4 | 1 | 1 | 1 |
| 7 | 7 | 200501BS00009 | -0.187623 | 51.502260 | 1 | 3 | 1 | 2 | 2005-01-14 | 6 | 17:35 | 3 | 30 | 1 | 1 | 1 | 1 |
| 8 | 8 | 200501BS00010 | -0.167342 | 51.483420 | 1 | 3 | 2 | 2 | 2005-01-15 | 7 | 22:43 | 6 | 30 | 4 | 1 | 1 | 1 |
| 9 | 9 | 200501BS00011 | -0.206531 | 51.512443 | 1 | 3 | 2 | 5 | 2005-01-15 | 7 | 16:00 | 6 | 30 | 1 | 1 | 1 | 1 |
Last rows
| | df_index | Accident_Index | Longitude | Latitude | Police_Force | Accident_Severity | Number_of_Vehicles | Number_of_Casualties | Date | Day_of_Week | Time | Road_Type | Speed_limit | Light_Conditions | Weather_Conditions | Road_Surface_Conditions | Urban_or_Rural_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1776623 | 1780642 | 2015984134815 | -3.415353 | 55.257772 | 98 | 3 | 1 | 1 | 2015-10-28 | 4 | 19:00 | 6 | 60 | 6 | 1 | 1 | 2 |
| 1776624 | 1780643 | 2015984135815 | -2.958098 | 55.077953 | 98 | 3 | 2 | 1 | 2015-11-20 | 6 | 16:55 | 6 | 60 | 6 | 2 | 2 | 2 |
| 1776625 | 1780644 | 2015984136815 | -3.177818 | 54.985933 | 98 | 3 | 1 | 2 | 2015-11-24 | 3 | 20:03 | 6 | 30 | 4 | 1 | 2 | 2 |
| 1776626 | 1780645 | 2015984137515 | -3.136722 | 54.992202 | 98 | 3 | 2 | 1 | 2015-12-01 | 3 | 17:15 | 6 | 60 | 6 | 4 | 2 | 2 |
| 1776627 | 1780646 | 2015984137615 | -3.262676 | 54.987365 | 98 | 3 | 2 | 1 | 2015-12-02 | 4 | 16:30 | 6 | 30 | 4 | 2 | 2 | 2 |
| 1776628 | 1780647 | 2015984139015 | -3.499388 | 55.106659 | 98 | 3 | 1 | 1 | 2015-12-13 | 1 | 02:30 | 6 | 60 | 6 | 7 | 4 | 2 |
| 1776629 | 1780648 | 2015984139115 | -3.376671 | 55.023855 | 98 | 3 | 3 | 1 | 2015-12-11 | 6 | 13:24 | 6 | 60 | 1 | 1 | 2 | 2 |
| 1776630 | 1780649 | 2015984139715 | -3.242159 | 55.016316 | 98 | 3 | 2 | 1 | 2015-12-02 | 4 | 13:50 | 6 | 60 | 1 | 1 | 2 | 2 |
| 1776631 | 1780650 | 2015984140215 | -3.387067 | 55.163502 | 98 | 2 | 1 | 4 | 2015-12-23 | 4 | 00:01 | 3 | 70 | 6 | 4 | 2 | 2 |
| 1776632 | 1780651 | 2015984140515 | -3.123385 | 55.020580 | 98 | 3 | 3 | 3 | 2015-12-26 | 7 | 12:40 | 6 | 60 | 1 | 2 | 2 | 2 |